# Program Execution Order for Generating Main Sample

## External Datasets Required

Before starting the execution workflow, you need to obtain these external datasets:

1. **Compustat Data**
   - Compustat annual data (`funda`)
   - Compustat quarterly data (`fundq`)
   - Compustat company information table (`company`)
   
2. **Supply Chain & Network Data**
   - Compustat segment data (`compustat_supplychain_update20210209`)
   - Factset supply chain pair relationships (`factset_update20210209`)
   - Vertical network data from Fresard, Hoberg, and Phillips (`VTNIC_89_2019`)
   - TNIC3-HHI and product similarity data from Hoberg and Phillips (`TNIC3HHIdata_1989_2019`)
   - Text-based industry peers from Hoberg and Phillips (`tnic3_data.txt`)

3. **Financial Constraints & Covenant Data**
   - Text-based financial constraints data (`TextBasedConstraintsDatabase`)
   - Dealscan covenant data (see SAS tables: `dealscan.financialcovenant`, `dealscan.networthcovenant`, `dealscan.facility`)
   - Dealscan-Compustat linking table (`dealscan_link.sas7bdat`)
   - Covenant violations data by Nini, Smith, and Sufi (please contact the authors for access)

4. **Ownership & Strategic Relationships Data**
   - Blockholder ownership data by Miriam Schwartz-Ziv and Ekaterina Volkova -- available in WRDS (`blockholders.csv`)
   - Joint venture data from SDC Platinum (`SDC_joint_ventures`)

5. **Linking Data**
   - GVKEY-CIK linking table (`gvkey_cik`)
   - GVKEY-CUSIP linking table (`gvkey_cusip_links`)
   - Compustat-DealScan linking table from Chava and Roberts (`dealscan_link.sas7bdat`)


## Execution Notes and Tips

1. **Format Conversions:**
   - Several steps require converting between Stata (.dta), SAS (.sas7bdat), and CSV formats for importing and exporting across languages. 

2. **Mixed Language Processing:**
   - SAS code is sometimes embedded as comments in Stata files. You can either extract that code and run it separately, or run the separate SAS-only file provided.

3. **Dependencies Installation:**
   - Ensure all required packages (and their dependencies) are installed:
     - **Stata:** winsor2, astile, ffind, reghdfe, gtools, mdraws, mvpn, estout
     - **Python:** pandas, networkx, numpy, matplotlib, pandasql
     - **MATLAB:** Spatial econometrics packages (see https://spatial-econometrics.com/ and add them to your path).
     - **MATLAB (edited):** Add the file `EstimateSAR_d.m` to your path to run the spatial model using dummies (non-standardized).
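
Before running the Python step, it can help to confirm that the listed Python packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the replication files.

```python
from importlib.util import find_spec

# Python packages named in the dependency list above.
REQUIRED_PYTHON_PACKAGES = ["pandas", "networkx", "numpy", "matplotlib", "pandasql"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PYTHON_PACKAGES)
    if missing:
        print("Install before running the Python steps:", ", ".join(missing))
    else:
        print("All listed Python packages are importable.")
```

The check only tests whether each top-level package can be located; it does not verify versions.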


## Execution Workflow

### Step 1: Prepare Compustat Annual Base Dataset
**File:** `0_1_Compustat_cleaned.do`  
**Language:** Stata  
**Description:** Creates the basic Compustat dataset with key financial variables.
- **Inputs:**
  - funda_1980_2025feb19.dta (Compustat annual data from WRDS, or `funda`)
- **Outputs:**
  - __compustat_cleaned.dta

### Step 2: Prepare Compustat Quarterly Dataset in SAS
> #### Part (a): Data merging in SAS
**File:** `0_2_Sample_Creation_Quarterly.sas`  
**Language:** SAS  
**Description:** Prepares quarterly Compustat data, calculates rolling sums, and adds financial ratios.
- **Inputs:**
  - comp_fundq4 (Compustat quarterly dataset)
  - financialcovenant (Dealscan Covenant file from WRDS)
  - networthcovenant  (Dealscan Net worth covenant file from WRDS)
  - dealscan_link  (Dealscan link file with Compustat from Chava and Roberts)
  - Nini, Smith, and Sufi covenant violations data (Provided by the authors)
- **Outputs:**
  - comp_fundq6.dta
  - comp_covenants.dta

> #### Part (b): Sample processing in Stata
**File:** `0_2_Comp_Deal_dttd_violations.do`  
**Language:** Stata  
**Description:** Creates measures of debt covenant violations and distance-to-default threshold.
- **Inputs:**
  - comp_fundq6 (Created in Part a of this Step)
  - comp_covenants2 (Created in Part a of this Step)
- **Outputs:**
  - COMP_Q_dttd_violations.dta 
  - COMP_A_dttd_violations.dta 

### Step 3: Create Dealscan Loan-Covenant Measures
**File:** `0_3_Sample_Creation_Quarterly.do`  
**Language:** Stata  
**Description:** Processes the quarterly Compustat data from SAS, adds covenant violation measures, constructs slack variables, and prepares the data for regression analysis.
- **Inputs:**
  - comp_covenants.dta (from SAS processing)
  - COMP_Q_dttd_violations.dta (covenant strictness measures)
  - Ptn_TNIC_TSIMM.dta (partners' product similarity)
- **Outputs:**
  - __covenant_slack2_allVars.dta (The main quarterly dataset)
  - __covenant_slack2.csv (CSV version for analysis in Matlab)


### Step 4: PART 1 of Annual (main) Sample Creation 
**File:** `0_Sample_Creation.do`  
**Language:** Stata  
**Description:** Merges all datasets and creates the main annual sample. This is the first part of the annual sample creation process used to generate some intermediate datasets.
- **Inputs:**
  - __compustat_cleaned.dta (Step 1)
  - COMP_A_dttd_violations.dta (Step 2, Part b)
- **Outputs:**
  - __compustat_cleaned_allVars2.dta (Intermediate dataset, used for calculating indirect effects that are merged back in Step 14)


### Step 5: Create Supply Chain Network
**File:** `1_1_SC_Network.do`  
**Language:** Stata  
**Description:** Constructs the Supply Chain Network by combining Compustat, Factset, and VTNIC data.
- **Inputs:**
  - compustat_supplychain_update20210209.dta (Compustat segment data)
  - factset_update20210209.dta (Factset edgelist of SC partners)
  - Sales_cogs_85_2019.dta (Compustat data on sales and Cost of goods sold)
  - VTNIC_89_2019 (From Fresard, Hoberg, and Phillips. See: https://faculty.marshall.usc.edu/Gerard-Hoberg/FresardHobergPhillipsDataSite/idata/VertNetwork_10gran.zip)
- **Outputs:**
  - SCnetwork2.dta
  - SCnetwork2_saveold.dta (Same as SCnetwork2 but in old Stata format, which SAS can import)
  - SCnetwork2.csv (for use in Matlab and Python)

### Step 6: Create Quarterly Supply Chain Network
**File:** `1_1_SC_Network_Qtr.do`  
**Language:** Stata  
**Description:** Creates a quarterly version of the supply chain network by replicating annual data across quarters.
- **Inputs:**
  - SCnetwork2.dta 
  - SCnetwork2_saveold.dta
- **Outputs:**
  - SCnetwork2_Q.dta
  - SCnetwork2_Q.csv
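
Replicating the annual network across quarters amounts to repeating each annual link once per quarter. A minimal sketch in plain Python follows. The edge-list layout `(year, supplier, customer)` and the function name are assumptions for illustration; the actual variable names in `SCnetwork2`, and whether quarters are calendar or fiscal, may differ.

```python
def replicate_across_quarters(annual_edges):
    """Expand (year, supplier, customer) rows into (year, quarter, supplier, customer)."""
    return [
        (year, quarter, supplier, customer)
        for (year, supplier, customer) in annual_edges
        for quarter in (1, 2, 3, 4)
    ]


# Hypothetical example: two annual links become eight quarterly links.
annual = [(2018, "1001", "2002"), (2019, "1001", "3003")]
quarterly = replicate_across_quarters(annual)
print(len(quarterly))  # 8
print(quarterly[0])    # (2018, 1, '1001', '2002')
```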

### Step 7: Create 5-year Lagged Network
**File:** `1_2_SC_Network.sas`  
**Language:** SAS  
**Description:** Creates lagged (5-year) Supply Chain Network data for both annual and quarterly frequencies.
- **Inputs:**
  - SCnetwork2_saveold 
  - SCnetwork2_Q.csv 
- **Outputs:**
  - SCnetwork2_projected5.csv
  - SCnetwork2_Q_projected5.csv

### Step 8: Calculate Relationship Metrics
**File:** `2_1_SC_rel_link.do`  
**Language:** Stata  
**Description:** Analyzes network statistics including relationship duration and competitive similarity measures.
- **Inputs:**
  - SCnetwork2
  - VTNIC_89_2019
  - TNIC3HHIdata_1989_2019
- **Outputs:**
  - SC_rel_link (Supply chain relationship metrics -- pairwise)
  - SC_TNIC3HHI_P_by_supplier (TNIC similarity by supplier/customer)
  - SC_rel_length_all_directed (Relationship length metrics, directed)
  - VT_depth (VTNIC depth, or extent of possible partners)
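
One simple way to measure relationship duration for a directed supplier-customer pair is to count the years in which the pair appears. The sketch below illustrates that idea only. The edge layout and the function name are assumptions, and `2_1_SC_rel_link.do` may define duration differently (for example, by requiring consecutive years).

```python
from collections import defaultdict


def relationship_lengths(edges):
    """Map each directed (supplier, customer) pair to (first_year, last_year, n_years)."""
    years_by_pair = defaultdict(set)
    for year, supplier, customer in edges:
        years_by_pair[(supplier, customer)].add(year)
    return {
        pair: (min(years), max(years), len(years))
        for pair, years in years_by_pair.items()
    }


# Hypothetical edge list: (year, supplier, customer).
edges = [
    (2015, "A", "B"),
    (2016, "A", "B"),
    (2018, "A", "B"),
    (2016, "B", "A"),  # the reverse direction is a separate pair
]
lengths = relationship_lengths(edges)
print(lengths[("A", "B")])  # (2015, 2018, 3)
print(lengths[("B", "A")])  # (2016, 2016, 1)
```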


### Step 9: Calculate Network Centrality Measures
**File:** `2_2_Network_Measures.py`  
**Language:** Python  
**Description:** Calculates network centrality measures (degree, clustering, closeness).
- **Inputs:**
  - SCnetwork2.dta
- **Outputs:**
  - centralities_full_network_1994_2019.dta
  - centralities_vtnic_1989_2019.csv
  - shortest_path (averages are printed only, not exported; slow to compute)
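
The three measures named in the description are available directly in `networkx`. The sketch below computes them on a small toy graph. Treating links as undirected and using a single pooled graph are assumptions made for illustration, not necessarily what `2_2_Network_Measures.py` does.

```python
import networkx as nx

# Hypothetical toy network: a triangle (A, B, C) plus one pendant firm D.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])

degree = nx.degree_centrality(G)        # share of other nodes each node is linked to
clustering = nx.clustering(G)           # share of a node's neighbor pairs that are linked
closeness = nx.closeness_centrality(G)  # inverse of average shortest-path distance

for firm in sorted(G.nodes):
    print(firm, round(degree[firm], 3), round(clustering[firm], 3), round(closeness[firm], 3))
```

In this toy graph, firm C is linked to every other firm, so it has the highest degree and closeness, while its clustering is lower than that of A and B because D is not linked to C's other neighbors.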

### Step 10: Calculate Indirect Effects (Column Sums)
**File:** `2_3_Indirect_effects.m`  
**Language:** MATLAB  
**Description:** Calculates firm-level measures of indirect effects using spatial econometric techniques.
- **Inputs:**
  - __compustat_cleaned_allVars2 (from Step 4)
  - SCnetwork2 
- **Outputs:**
  - SC_column_sums
- **IMPORTANT NOTE: Run the following Stata code to generate and save a Stata dataset with the Indirect effects (column sums):**
  ```stata
  import delimited "SC_column_sums.csv", clear
  rename columnsum columnsum_all
  winsor2 columnsum_all
  drop index
  save "SC_column_sums", replace
  ```

### Step 11: Calculate Column Sums Excluding VTNIC
**File:** `2_4_Indirect_effects_No_VTNIC.m`  
**Language:** MATLAB + Stata  
**Description:** Generates column sums of the spatial weights matrix excluding VTNIC connections.
- **Inputs:**
  - SCnetwork2
  - compustat_cleaned
- **MATLAB Outputs:**
  - SC_column_sums_No_VTNIC.csv
- **IMPORTANT NOTE: Run the following Stata code (available also as a comment at the end of the file) to generate the final dataset:**
  ```stata
  import delimited "SC_column_sums_No_VTNIC.csv", clear
  rename columnsum columnsum_noVTNIC
  winsor2 columnsum_noVTNIC
  drop index
  save "SC_column_sums_No_VTNIC", replace
  ```
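
For intuition, the column sum for firm j adds up the weights that all other firms place on j in the spatial weights matrix W. The sketch below builds W from a directed edge list and returns its column sums. It is an illustration under stated assumptions: the edge layout, the function name, and the optional row-normalization are not taken from the MATLAB files.

```python
from collections import defaultdict


def column_sums(edges, row_normalize=True):
    """Column sums of a weights matrix built from directed (i, j) links.

    W[i][j] = 1 if firm i is linked to firm j. With row_normalize=True,
    each row is divided by its row sum before the columns are added up.
    """
    neighbors = defaultdict(set)
    for i, j in edges:
        if i != j:  # keep the diagonal of W at zero
            neighbors[i].add(j)

    firms = set(neighbors) | {j for targets in neighbors.values() for j in targets}
    sums = {firm: 0.0 for firm in firms}
    for i, targets in neighbors.items():
        weight = 1.0 / len(targets) if row_normalize else 1.0
        for j in targets:
            sums[j] += weight
    return sums


# Hypothetical links: A -> B, A -> C, B -> C.
edges = [("A", "B"), ("A", "C"), ("B", "C")]
print(column_sums(edges))                       # C receives 0.5 + 1.0 = 1.5
print(column_sums(edges, row_normalize=False))  # C receives 1 + 1 = 2
```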

### Step 12: Create Blockholder Network
  #### Part 1 - Stata: `2_5_Blockholder_network.do`
  **Language:** Stata  
  **Description:** Generates the blockholder network and merges it with the SC network.
  - **Inputs:** gvkey_cik, blockholders
  - **Outputs:** block_network
  #### Part 2 - SAS: `2_5_Blockholder_network_SAS.sas`
  - **Inputs:** SCnetwork2, block_network
  - **Outputs:** SC_blocks.sas7bdat
  #### Part 3 - Stata: `2_5_Blockholder_network.do`
  - **Inputs:** SC_blocks.sas7bdat
  - **Outputs:** sc_blocks_collapsed
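
A common way to build an ownership network is to link two firms whenever the same blockholder holds a block in both. The sketch below illustrates that construction. It is an assumption about the general idea, not a description of `2_5_Blockholder_network.do`, and the record layout and function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def common_blockholder_pairs(holdings):
    """Return sorted firm pairs that share at least one blockholder in the same year."""
    firms_by_holder_year = defaultdict(set)
    for year, blockholder, firm in holdings:
        firms_by_holder_year[(year, blockholder)].add(firm)

    pairs = set()
    for (year, _holder), firms in firms_by_holder_year.items():
        for firm_a, firm_b in combinations(sorted(firms), 2):
            pairs.add((year, firm_a, firm_b))
    return sorted(pairs)


# Hypothetical holdings: (year, blockholder, firm identifier).
holdings = [
    (2015, "Fund X", "1001"),
    (2015, "Fund X", "2002"),
    (2015, "Fund Y", "2002"),
    (2015, "Fund Y", "3003"),
    (2016, "Fund X", "1001"),
]
print(common_blockholder_pairs(holdings))
# [(2015, '1001', '2002'), (2015, '2002', '3003')]
```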

### Step 13: Create Joint Venture Data
  #### Part 1 - Stata: `2_6_Joint_Venture_Data.do`
  **Language:** Stata  
  **Description:** First part of the code. Starts from the raw SDC joint venture data and creates the main sample.
  - **Inputs:** SDC_joint_ventures, gvkey_cusip_links
  - **Outputs:** JV_cleaned_short
  #### Part 2 - SAS: `2_6_Joint_Venture_Data(SAS).sas`
  **Language:** SAS  
  **Description:** Middle part of the code. Run the SAS code, which is also embedded as a comment in the Stata file.
  - **Inputs:** SCnetwork2, JV_cleaned_short
  - **Outputs:** SC_alliances.sas7bdat
  #### Part 3 - Stata: `2_6_Joint_Venture_Data.do`
  **Description:** Final part of the code, which builds the dataset used in the main sample.
  - **Inputs:** SC_alliances.sas7bdat
  - **Outputs:** SC_alliances_collapsed


### Step 14: Create Final Dataset -- PART 2 of Annual Sample Creation
**File:** `0_Sample_Creation.do`  
**Language:** Stata  
**Description:** Integrates all previously created datasets, performs additional calculations, and creates the final dataset.
- **Inputs:**
  - __compustat_cleaned_allVars2 (from Step 4)
  - SCnetwork2 (Creates SCnetwork2_partners internally)
  - TNIC3HHIdata_1989_2019
  - tnic3_data.txt
  - SC_rel_length_all_directed (Step 8)
  - VT_depth (Step 8)
  - SC_TNIC3HHI_P_by_supplier (Step 8)
  - SC_column_sums (Step 10)
  - SC_column_sums_No_VTNIC (Step 11)
  - sc_blocks_collapsed (Step 12)
  - SC_alliances_collapsed (Step 13)
- **Intermediate Outputs:**
  - compustat_cleaned_allVars3
  - P_Compustat_cleaned_allVars2
- **Final Outputs:**
  - __compustat_cleaned_allVars4 (Final Stata dataset)
  - __compustat_cleaned (Final CSV for MATLAB analysis)

### Step 15: Generate results 
  - Table 1: `Table1_(Sum_stats).do`
  - Table 2: `Table2_Network_Regs_PanelsAB.m`
  - Table 3: `Table3_Network_RDD.m`
  - Table 4: `Table4_(Trade_Credit).do`
  - Table 5: `Table5_(Eq+JV).do`
  - Table 6: `Table6_(Input_Specificity).do`
  - Table 7: `Table7_(Input_Specificity).do`
  - Table 8: `Table8_(Industry-Shocks).m`
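
The ordering constraints implied by the inputs and outputs listed in Steps 1 to 14 can be checked mechanically. The sketch below encodes each step's upstream steps, as read from the input lists in this document, and uses the standard library's `graphlib` (Python 3.9+) to produce one valid execution order. The dictionary is a reading of this document, not a file shipped with the code.

```python
from graphlib import TopologicalSorter

# step -> set of steps whose outputs it reads, as read from the input lists.
# Externally obtained datasets are not modeled. Ptn_TNIC_TSIMM.dta (a Step 3
# input) is not listed as an output of any step, so it is omitted here.
DEPENDS_ON = {
    1: set(),
    2: set(),
    3: {2},
    4: {1, 2},
    5: set(),
    6: {5},
    7: {5, 6},
    8: {5},
    9: {5},
    10: {4, 5},
    11: {1, 5},  # assumes "compustat_cleaned" refers to the Step 1 output
    12: {5},
    13: {5},
    14: {4, 5, 8, 10, 11, 12, 13},
}

# static_order() yields every step after all of the steps it depends on.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

Any order that `static_order()` returns is valid; running the steps in numerical order, as listed in this document, satisfies the same constraints.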


